Abstract. Let $h_R$ denote an $L^\infty$ normalized Haar function adapted to a dyadic rectangle
$R \subset [0,1]^3$. We show that there is a positive $\eta < \frac12$ so that for all integers $n$, and coefficients
$a(R)$ we have

$$ 2^{-n} \sum_{|R|=2^{-n}} |a(R)| \lesssim n^{1-\eta} \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_\infty . $$

This is an improvement over the 'trivial' estimate by an amount of $n^{-\eta}$, while the Small Ball
Conjecture says that the inequality should hold with $\eta = \frac12$. There is a corresponding lower
bound on the $L^\infty$ norm of the Discrepancy function of an arbitrary distribution of a finite
number of points in the unit cube in three dimensions. The prior result, in dimension 3, is
that of Jozsef Beck [1], in which the improvement over the trivial estimate was logarithmic
in $n$. We find several simplifications and extensions of Beck's argument to prove the result
above.



1. The Principal Conjecture and the Main Results 

In one dimension, the class of dyadic intervals in the unit interval is
$\mathcal D := \{[j2^{-k}, (j+1)2^{-k}) : j,k \in \mathbb N,\ 0 \le j \le 2^k - 1\}$. Each dyadic interval $I$ has a left and right half, which are
also dyadic. Define the Haar functions

$$ h_I := -1_{I_{\mathrm{left}}} + 1_{I_{\mathrm{right}}} . $$

Note that we use an $L^\infty$ normalization of these functions, which will make some formulas
seem odd to a reader accustomed to the $L^2$ normalization.

In dimension $d$, a dyadic rectangle in the unit cube $[0,1]^d$ is a product of dyadic intervals,
thus an element of $\mathcal D^d$. A Haar function associated to $R$ is defined as a product of the Haar
functions associated with each side of $R$, namely

$$ h_{R_1 \times \cdots \times R_d}(x_1, \dots, x_d) := \prod_{j=1}^{d} h_{R_j}(x_j) . $$

This is the usual 'tensor' definition.
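
The tensor definition is easy to evaluate directly. The following is a minimal sketch, not taken from the paper: the names `haar_1d` and `haar_rect`, and the encoding of a dyadic interval $[j2^{-k}, (j+1)2^{-k})$ as a pair `(k, j)`, are assumptions made for illustration only.

```python
from fractions import Fraction


def haar_1d(k, j, x):
    """L-infinity normalized Haar function of the dyadic interval [j 2^-k, (j+1) 2^-k).

    Returns -1 on the left half, +1 on the right half, and 0 off the interval.
    """
    left = Fraction(j, 2 ** k)
    length = Fraction(1, 2 ** k)
    if not (left <= x < left + length):
        return 0
    return -1 if x < left + length / 2 else 1


def haar_rect(sides, point):
    """Tensor-product Haar function of a dyadic rectangle.

    `sides` is a list of (k, j) pairs, one per coordinate (hypothetical encoding);
    `point` is a tuple of coordinates in [0, 1).
    """
    value = 1
    for (k, j), x in zip(sides, point):
        value *= haar_1d(k, j, x)
    return value
```

For instance, `haar_rect([(0, 0), (1, 1)], (Fraction(1, 4), Fraction(7, 8)))` evaluates the Haar function of $[0,1) \times [\frac12, 1)$ at the point $(\frac14, \frac78)$.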

We will concentrate on rectangles with fixed volume. This is the 'hyperbolic' assumption,
that pervades the subject. Our concern is the following Theorem and Conjecture
concerning a lower bound on the $L^\infty$ norm of sums of hyperbolic Haar functions:

1.1. Theorem (Talagrand [13], Temlyakov [16]). In dimension $d = 2$, we have

(1.2) $$ 2^{-n} \sum_{|R|=2^{-n}} |a(R)| \lesssim \Bigl\| \sum_{|R| \ge 2^{-n}} a(R) h_R \Bigr\|_\infty . $$

Here, the sum on the right is taken over all rectangles with area at least $2^{-n}$.

1.3. Small Ball Conjecture. For dimension $d \ge 3$ we have the inequality

(1.4) $$ 2^{-n} \sum_{|R|=2^{-n}} |a(R)| \lesssim n^{\frac12 (d-2)} \Bigl\| \sum_{|R| \ge 2^{-n}} a(R) h_R \Bigr\|_\infty . $$

This conjecture is, by one square root of $n$, better than the trivial estimate available
from the Cauchy-Schwarz inequality, see § 2. As well, see that section for an explanation
as to why the conjecture is sharp. The case of $d = 2$ (with a sum over $|R| = 2^{-n}$ on the
right-hand side) was resolved by Talagrand [13]. Temlyakov has given an easier proof
of the inequality in its present form [14], [16], which resonates with the ideas of Roth [9],
Schmidt [10], and Halasz [6].

Perhaps it is worthwhile to explain the nomenclature 'Small Ball' at this point. The name
comes from probability theory. Assume that $X_t : T \to \mathbb R$ is a canonical Gaussian process
indexed by a set $T$. The Small Ball Problem is concerned with estimates of $\mathbb P(\sup_{t \in T} |X_t| < \epsilon)$
as $\epsilon$ goes to zero, i.e. the probability that the random process takes values in an $L^\infty$ ball of
small radius. The reader is advised to consult a paper by Kuelbs and Li [7] for a survey
of this type of question. A particular question of interest to us deals with the Brownian
Sheet, that is, a centered Gaussian process indexed by the points in the unit cube $[0,1]^d$
and characterized by the covariance relation $\mathbb E X_s \cdot X_t = \prod_{j=1}^{d} \min(s_j, t_j)$. The conjectured
form of the aforementioned probability in this case is the following:

1.5. The Small Ball Conjecture for the Brownian Sheet. In dimensions $d \ge 2$, for the
Brownian Sheet $B$ we have

$$ -\log \mathbb P(\|B\|_{C([0,1]^d)} < \epsilon) \simeq \epsilon^{-2} (\log 1/\epsilon)^{2d-1} , \qquad \epsilon \downarrow 0 . $$

In dimension d = 2, this conjecture has been resolved by Talagrand in the already cited 
paper [13], in which he used a version of (1.2) for continuous wavelets in place of Haars to 
prove the lower bound in the inequality above. In higher dimensions, the upper bounds 
are established and the known lower bounds miss the conjecture by a single power of the 
logarithm. 

Kuelbs and Li [7] have discovered a tight connection between the Small Ball probabilities
and the properties of the reproducing kernel Hilbert space corresponding to the process,
which in the case of the Brownian Sheet is $WM^2$, the Sobolev space of the functions on
$[0,1]^d$ with mixed derivative in $L^2$. In Approximation Theory, the covering number $N(\epsilon)$
is defined as the smallest number of $L^\infty$ balls of radius $\epsilon$ needed to cover the unit ball of
$WM^2$, i.e. the cardinality of the smallest $\epsilon$-net, a quantification of compactness of the unit
ball in the uniform metric. The result of Kuelbs and Li states that

1.6. Theorem. In dimension $d \ge 2$, as $\epsilon \downarrow 0$ we have

$$ -\log \mathbb P(\|B\|_{C([0,1]^d)} < \epsilon) \simeq \epsilon^{-2} (\log 1/\epsilon)^{\beta} \quad \text{iff} \quad \log N(\epsilon) \simeq \epsilon^{-1} (\log 1/\epsilon)^{\beta/2} . $$

This theorem together with Talagrand's work shows that the Small Ball Conjecture 
1.3 for continuous wavelets implies the lower bound in the conjectured asymptotics of 
the covering numbers N(e) (the upper bounds are known). It is also not very hard to 
show this implication directly. The Small Ball Conjecture for the Haar functions implies 
a lower bound for the covering numbers of the space $WM^2$. A detailed discussion of the
connections of the Small Ball Conjecture to the Approximation Theory and other related 
areas can be found in [15], [17]. 

Even though all of the mentioned questions had been completely resolved in dimension 
d = 2, there has been very little progress in higher dimensions. The main result of 
the present paper is a partial resolution of the three dimensional case of the Small Ball 
Conjecture. We extend and simplify an approach of J. Beck [1], establishing the following 
theorem: 

1.7. Theorem. In dimension $d = 3$, there is a positive $\eta > 0$ for which we have the estimate

(1.8) $$ 2^{-n} \sum_{|R|=2^{-n}} |a(R)| \lesssim n^{1-\eta} \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_\infty . $$

Beck [1] established this inequality with $n^{-\eta}$ replaced by a term logarithmic in $n$, although
Beck himself did not state the result this way, as the principal concern of that paper is the
question of Irregularities of Distribution, another area relevant to the Small Ball Conjecture.

In this subject one takes $\mathcal A_N$ to be $N$ points in the $d$-dimensional unit cube, and considers
the Discrepancy Function

(1.9) $$ D_N(x) = \sharp\bigl(\mathcal A_N \cap [0, x)\bigr) - N \, |[0, x)| . $$

Here $[0,x) = \prod_{j=1}^{d} [0, x_j)$ is a rectangle with antipodal corners being $0$ and $x$. We will
typically suppress the dependence upon the selection of points $\mathcal A_N$. A set of points will
be well distributed if this function is small in some appropriate function space. Thus, the
principal concern is various lower bounds for the $L^p$ norm of $D_N$. Many variants of this
question are interesting; readers are encouraged to consult one of the excellent references
in this area, e.g. [2]. The connection (see footnote 1) to the Small Ball Conjecture lies in the 'hyperbolic
orthogonal function' method initiated by Roth [9] when he proved that for all dimensions
$d \ge 2$,

$$ \|D_N\|_2 \gtrsim (\log N)^{\frac{d-1}{2}} . $$

Later, Schmidt [10] showed that in dimension 2, the $L^\infty$ norm of the discrepancy function
is much bigger than what the $L^2$ estimate gives us:

$$ \|D_N\|_\infty \gtrsim \log N . $$

Notice that, just like in the Small Ball Conjecture 1.3, this beats the $L^2$ bound by one square
root.

Footnote 1. One expects extremal point distributions $\mathcal A_N$ to have about one point in each cube of volume about $N^{-1}$.
Thus the Haar functions adapted to dyadic rectangles of about this volume are important.

Using our method of proof, and well known facts in the literature on Irregularities of
Distribution ([1,2]), we obtain the following theorem:

1.10. Theorem. There is a choice of $0 < \eta < \frac12$ for which the following estimate holds for all
collections $\mathcal A_N \subset [0,1]^3$:

(1.11) $$ \|D_N\|_\infty \gtrsim (\log N)^{1+\eta} . $$

Beck's result is as above, with $(\log N)^{\eta}$ replaced by a doubly logarithmic term in $N$.
There is no further result known to the authors about the Small Ball Problem, nor the $L^\infty$
norm of the Discrepancy Function in higher dimensions.

Concerning the value of $\eta$ for which our Theorems hold, it is computable, but we do not
carry out this step, as the particular $\eta$ we would obtain is certainly not optimal. Instead,
the point of this proof is that the methods pioneered by Jozsef Beck are more powerful
than originally suspected. We expect that a more efficient organization of the proof, and fewer ad
hoc constructions, will yield quantifiable and substantive improvements to the results of
this paper (see footnote 2).

The organization of the proof, at the highest level, and outlined in § 7, is that of Jozsef 
Beck [1]. At the same time, both the exact construction and subsequent details are in many 
respects easier than in Beck's paper. In particular, the construction in that section is a 
Riesz product construction, following the lines of § 3. But, the product, with our current 
understanding, must be taken to be 'short,' a dictation to us from the third dimension: the 
'product rule' 3.1 does not hold in dimension three. This unfortunate, and critical fact, 
forces the definition of 'strongly distinct' on us. See Definition 6.4. Still, our Riesz product 
is defined in a way to facilitate the use of Littlewood Paley inequalities and conditional 
expectation arguments, which is the source of our simplification and strengthening of 
Beck's argument. 

The principal argument begins in § 6. The earlier sections of the paper include a brief 
discussion of prerequisites for the proof. 

Acknowledgment. We have benefited from several conversations with Mihalis Kolountzakis 
and Vladimir Temlyakov on this subject. A substantial part of work by the second-named 
author was done while in residence at the University of Crete. 

Footnote 2. Additional steps that one could take to optimize the proof are known to the authors; others are the
subject of speculation.

2. The Trivial Bounds

Notation. The language and notation of probability and expectation is used throughout.
Thus,

$$ \mathbb E f = \int_{[0,1]^d} f(x) \, dx $$

and $\mathbb P(A) = \mathbb E 1_A$. This serves to keep formulas simpler. As well, certain conditional
expectation arguments are essential to us. We use the notation

$$ \mathbb P(B \mid A) = \mathbb P(A)^{-1} \, \mathbb P(A \cap B) , \qquad \mathbb E(f \mid A) = \mathbb P(A)^{-1} \, \mathbb E(f 1_A) . $$

For a sigma field $\mathcal F$, $\mathbb E(f \mid \mathcal F)$ is the conditional expectation of $f$ given $\mathcal F$. In all instances,
$\mathcal F$ will be generated by a finite collection of atoms $\mathcal F_{\mathrm{atoms}}$, in which case

$$ \mathbb E(f \mid \mathcal F) = \sum_{A \in \mathcal F_{\mathrm{atoms}}} \mathbb P(A)^{-1} \, \mathbb E(f 1_A) \cdot 1_A . $$

We suppress many constants which do not affect the arguments in essential ways. $A \lesssim B$
means that there is an absolute constant $K$ so that $A \le KB$. Thus $A \lesssim 1$ means that $A$ is
bounded by an absolute constant. And $A \simeq B$ means $A \lesssim B \lesssim A$.

The inequality (1.2) with an extra square root of n is easy to prove. 



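2.1. Lemma. It is the case that

$$ \sum_{|R|=2^{-n}} |a(R)| \cdot |R| \lesssim n^{\frac12 (d-1)} \Bigl\| \sum_{|R| \ge 2^{-n}} a(R) h_R \Bigr\|_\infty . $$

Proof. Each point $x \in [0,1]^d$ is in at most $n^{d-1}$ possible rectangles of volume $2^{-n}$. This is the essential point
dictated by the hyperbolic nature of the problem. Using this, and the Cauchy-Schwarz
inequality, we have

$$ \sum_{|R|=2^{-n}} |a(R)| \cdot |R| = \Bigl\| \sum_{|R|=2^{-n}} |a(R)| \, 1_R \Bigr\|_1 $$
$$ \lesssim n^{\frac12 (d-1)} \Bigl\| \Bigl[ \sum_{|R|=2^{-n}} |a(R)|^2 \, 1_R \Bigr]^{1/2} \Bigr\|_1 $$
$$ \lesssim n^{\frac12 (d-1)} \Bigl\| \sum_{|R| \ge 2^{-n}} a(R) h_R \Bigr\|_2 $$
$$ \lesssim n^{\frac12 (d-1)} \Bigl\| \sum_{|R| \ge 2^{-n}} a(R) h_R \Bigr\|_\infty . $$

□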
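
The chain of inequalities in this proof can be checked numerically in a small case. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $d = 2$, uses only rectangles with $|R| = 2^{-n}$ on the right-hand side, and the function names and the indexing of rectangles by `(s, j1, j2)` are hypothetical. In this setting every point lies in exactly $n+1$ rectangles, so the argument gives $2^{-n}\sum|a(R)| \le \sqrt{n+1}\,\|\sum a(R)h_R\|_\infty$ with constant one.

```python
import random


def hyperbolic_sum_on_grid(n, coeff):
    """Values of sum_{|R| = 2^-n} a(R) h_R on the 2^(n+1) x 2^(n+1) grid of cells (d = 2).

    coeff[(s, j1, j2)] is a(R) for the rectangle whose first side is the j1-th
    dyadic interval of length 2^-s and whose second side is the j2-th dyadic
    interval of length 2^-(n-s). Every Haar function is constant on each cell.
    """
    m = 2 ** (n + 1)
    values = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for k in range(m):
            total = 0.0
            for s in range(n + 1):
                j1, j2 = i >> (n + 1 - s), k >> (s + 1)
                # The bit below tells whether the cell sits in the right half (+1)
                # or the left half (-1) of the corresponding side.
                sign1 = 1 if (i >> (n - s)) & 1 else -1
                sign2 = 1 if (k >> s) & 1 else -1
                total += coeff[(s, j1, j2)] * sign1 * sign2
            values[i][k] = total
    return values


def check_trivial_bound(n, seed=0):
    """Return (2^-n sum |a(R)|, sqrt(n+1) * sup norm) for random coefficients."""
    rng = random.Random(seed)
    coeff = {(s, j1, j2): rng.uniform(-1, 1)
             for s in range(n + 1)
             for j1 in range(2 ** s)
             for j2 in range(2 ** (n - s))}
    values = hyperbolic_sum_on_grid(n, coeff)
    lhs = 2.0 ** (-n) * sum(abs(a) for a in coeff.values())
    sup_norm = max(abs(v) for row in values for v in row)
    return lhs, (n + 1) ** 0.5 * sup_norm
```

Calling `check_trivial_bound(4)` returns the two sides of the inequality for one random choice of coefficients; the first entry should never exceed the second.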



Let us also see that the Small Ball Conjecture is sharp. Indeed, we take the $a(R)$ to be
random choices of signs. It is immediate that

$$ 2^{-n} \sum_{|R|=2^{-n}} |a(R)| \simeq n^{d-1} . $$

On the other hand, for fixed $x \in [0,1]^d$, by the properties of Rademacher functions we have

$$ \mathbb E \Bigl| \sum_{|R|=2^{-n}} a(R) h_R(x) \Bigr| \simeq n^{\frac12 (d-1)} . $$

It is also well known that sums of Rademacher random variables obey a sub-Gaussian
distributional estimate. The supremum of such sums admits easily estimated upper
bounds. In particular, it is enough to test the $L^\infty$ norm of the sum at a grid of $2^{nd}$ points in
the unit cube, hence we have

$$ \mathbb E \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_\infty \lesssim \sqrt{\log 2^{nd}} \cdot \sup_x \mathbb E \Bigl| \sum_{|R|=2^{-n}} a(R) h_R(x) \Bigr| \lesssim n^{d/2} . $$

Comparing these two estimates shows that the Small Ball Conjecture is sharp. In the
trigonometric case, a similar remark has appeared in [18].

3. Proof of Talagrand's Theorem 

In this section we sketch the proof by V. Temlyakov [16] of the stronger inequality (1.2)
in the case of $d = 2$, as this will help to understand our construction for $d = 3$. The line of
reasoning is similar to that of Schmidt [10].

The decisive point in two dimensions is that one has a 'product rule': 

3.1. Product Rule in Dimension 2. Let $R, R'$ be two dyadic rectangles of the same area. Then,
$h_R \cdot h_{R'} \in \{0, 1_R, \pm h_{R \cap R'}\}$. More generally, let $R_1, R_2, \dots, R_k$ be dyadic rectangles of equal area and
distinct lengths in e.g. their first coordinates. Then $\prod_{j=1}^{k} h_{R_j} \in \{0, \pm h_{R_1 \cap \cdots \cap R_k}\}$.

The fact that this 'product rule' fails in higher dimensions is the most essential complication
to the resolution of the Small Ball Conjecture.
The proof of (1.2) is by duality. Fix

$$ H = \sum_{|R| \ge 2^{-n}} a(R) h_R . $$

We will construct a function $\Psi$ with $L^1$ norm at most 1, for which the inner product

(3.2) $$ \langle H, \Psi \rangle = 2^{-n-1} \sum_{|R|=2^{-n}} |a(R)| . $$

This clearly implies Theorem 1.1. Moreover, the function $\Psi$ is defined as a Riesz product:

$$ \Psi := \prod_{s=1}^{n} (1 + \psi_s) , \qquad \psi_s = \sum_{R \,:\, |R_1| = 2^{-s},\ |R_2| = 2^{-n+s}} \mathrm{sgn}(a(R)) \, h_R . $$

Of course $\Psi$ is non-negative. Moreover, it has $L^1$ norm one: expanding the product, the
leading term is 1. All products of $\psi_s$ are, by Proposition 3.1, a sum of Haar functions, hence
have mean zero. A similar argument implies (3.2). The proof is complete.
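
The two properties of the Riesz product used here, non-negativity and mean one, can be confirmed by brute force for small $n$. The following is a minimal sketch under assumptions: the index $s$ is taken to run from 1 to $n$, the signs stand in for $\mathrm{sgn}(a(R))$, and the function name and the `(s, j1, j2)` indexing of rectangles are hypothetical.

```python
import random


def riesz_product_grid(n, sign):
    """Values of prod_{s=1}^n (1 + psi_s) on the grid of 2^(n+1) x 2^(n+1) cells (d = 2).

    sign[(s, j1, j2)] in {+1, -1} plays the role of sgn(a(R)) for the rectangle
    with sides of length 2^-s and 2^-(n-s) at positions j1 and j2.
    """
    m = 2 ** (n + 1)
    grid = []
    for i in range(m):
        row = []
        for k in range(m):
            value = 1
            for s in range(1, n + 1):
                j1, j2 = i >> (n + 1 - s), k >> (s + 1)
                h1 = 1 if (i >> (n - s)) & 1 else -1
                h2 = 1 if (k >> s) & 1 else -1
                # psi_s takes the value +1 or -1 here, so each factor is 0 or 2.
                value *= 1 + sign[(s, j1, j2)] * h1 * h2
            row.append(value)
        grid.append(row)
    return grid
```

Since every factor is 0 or 2, the product is automatically non-negative; the mean-one property shows up as the cell values summing to the number of cells.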



SMALL BALL INEQUALITY IN THREE DIMENSIONS 






4. Littlewood-Paley Theory 

In this section we review some basic facts from the Littlewood-Paley Theory, which
will be used repeatedly in subsequent sections. We state the main inequalities here to
make the exposition self-contained. We also remind the reader that the Haar functions
are normalized to have $L^\infty$ norm one, so that our formulas are different from most of our
references.

It is important to our applications that we consider the Haar basis as one for vector
valued functions. The vector space should be a Hilbert space $\mathcal H$, and by $L^p_{\mathcal H}$ we mean the
class of measurable functions $f : [0,1] \to \mathcal H$ such that $\mathbb E |f|_{\mathcal H}^p < \infty$.

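The Haar Square Function is

$$ S(f) := \Bigl[ |\mathbb E f|^2 + \sum_{I \in \mathcal D} \frac{|\langle f, h_I \rangle|^2}{|I|^2} \, 1_I \Bigr]^{1/2} . $$

Here, $\langle f, h_I \rangle = \int h_I(x) f(x) \, dx$ and $\mathbb E f$ should be understood as Bochner integrals, and we
are taking the Hilbert space norm of those terms that involve $f$. We shall be applying the
Square Function in the cases when $f$ is a finite linear combination of Haars, i.e. $f = \sum_{I \in \mathcal I} a_I h_I$,
where $\mathcal I$ is a finite subset of $\mathcal D$ and $\{a_I\}_{I \in \mathcal I} \subset \mathcal H$. In this case, $f$ has mean zero and the Square
Function takes the form

$$ S(f) = \Bigl[ \sum_{I \in \mathcal I} |a_I|_{\mathcal H}^2 \, 1_I \Bigr]^{1/2} . $$

Of course we have $\|f\|_2 = \|S(f)\|_2$ just due to the fact that $\{1_{[0,1]}\} \cup \{h_I : I \in \mathcal D\}$ is an
orthogonal basis.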

The Littlewood-Paley Inequalities are an extension of this equality, to an approximate
version that holds on all $L^p$, $1 < p < \infty$.

4.1. Littlewood Paley Inequalities. For $1 < p < \infty$ there are absolute constants $0 < A_p < B_p < \infty$
so that

(4.2) $$ \|f\|_p \le B_p \|S(f)\|_p , \quad 1 < p < \infty , \qquad B_p \lesssim 1 + \sqrt p \ \text{ for } p \ge 2 . $$

In the reverse direction, we have

(4.3) $$ A_p \|S(f)\|_p \le \|f\|_p , \quad 1 < p < \infty , \qquad A_p^{-1} \lesssim 1 + 1/\sqrt{p-1} . $$

We stress that these results are delicate (see footnote 3). Burkholder [3] has shown that the best constants
in the inequality above for general martingales are $A_p^{-1} = B_p = \max\{p, \frac{p}{p-1}\} - 1$. However, a
Haar series is not a general martingale; it is dyadic, which forces conditional symmetry.
See [4,5,19].

Footnote 3. To prove our Theorems, we only need these inequalities with constant $B_p \lesssim p^{t}$ for some fixed power $t$.
But the power $t = \frac12$ is the sharp result, so we use it here.

The constants above are sharp. To see that $B_p \simeq \sqrt p$ is sharp for $p$ large, just use the
Central Limit Theorem for Rademacher random variables.
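
The $\sqrt p$ growth can be seen in exact arithmetic. A normalized sum of $N$ independent random signs has square function identically one, so its $L^p$ norm is a lower bound for $B_p$. The sketch below is an illustration, not the paper's argument; the function name is hypothetical, and the moment is computed exactly from the binomial distribution.

```python
from math import comb


def rademacher_lp_norm(num_terms, p):
    """Exact L^p norm of N^{-1/2} (eps_1 + ... + eps_N) for independent random signs.

    The sum equals 2k - N with probability comb(N, k) / 2^N, so the p-th moment
    is a finite sum of integers divided by 2^N.
    """
    moment = sum(comb(num_terms, k) * abs(2 * k - num_terms) ** p
                 for k in range(num_terms + 1))
    return (moment / 2 ** num_terms) ** (1.0 / p) / num_terms ** 0.5
```

Dividing `rademacher_lp_norm(400, p)` by `p ** 0.5` for a few values of `p` gives ratios that stay within a fixed range, which is the behaviour the Central Limit Theorem predicts for Gaussian moments.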

5. Exponential Moments 

Let $\psi : \mathbb R \to \mathbb R$ be a symmetric convex function with $\psi(x) = 0$ iff $x = 0$. Define the
Orlicz norm

(5.1) $$ \|f\|_{\psi} := \inf\{ C > 0 : \mathbb E \psi(f/C) \le 1 \} . $$

We take the infimum of the empty set to be $+\infty$, and denote by $L^{\psi}$ the collection of
functions for which $\|f\|_{\psi} < \infty$. If $\psi(x) = x^p$, then $\|\cdot\|_{\psi}$ is the usual $L^p$ norm.

We are especially interested in the class of $\psi$ given by $\psi_\alpha(x) = e^{|x|^{\alpha}}$, $|x| \gtrsim 1$. We will
write $L^{\psi_\alpha} = \exp(L^{\alpha})$. These are the exponential Orlicz classes. The following equivalence
is well known and is based on Taylor series and Stirling's formula:

5.2. Proposition. We have the equivalence of norms

$$ \|f\|_{\exp(L^{\alpha})} \simeq \sup_{p \ge 1} p^{-1/\alpha} \|f\|_p \simeq \sup_{\lambda > 0} \lambda \, |\log \mathbb P(|f| > \lambda)|^{-1/\alpha} . $$

The following distributional estimate holds for hyperbolic sums of Haar functions:

5.3. Theorem. In dimension $d \ge 2$ we have the estimate

(5.4) $$ \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_{\exp(L^{2/(d-1)})} \lesssim \Bigl\| \Bigl[ \sum_{|R|=2^{-n}} |a(R)|^2 \, 1_R \Bigr]^{1/2} \Bigr\|_\infty . $$

Of principal relevance to us is the three dimensional case, where the estimate above
asserts that the hyperbolic sums are exponentially integrable.

Proof. The tool is the vector valued Littlewood Paley inequality, with sharp rate of growth
in the constants as $p \to \infty$, stated in the previous section. As such the proof is a standard
one, see [5,8]. We will make use of similar arguments more than once in this paper.

Applying the one dimensional Littlewood Paley inequality in the coordinate $x_1$ we see
that

(5.5) $$ \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_p \lesssim \sqrt p \, \Bigl\| \Bigl[ \sum_{r_1=1}^{n} \Bigl| \sum_{|R|=2^{-n},\ |R_1|=2^{-r_1}} a(R) h_R \Bigr|^2 \Bigr]^{1/2} \Bigr\|_p . $$

If we are in dimension 2, note that due to the hyperbolic assumption, all the rectangles
satisfying the conditions of the summation are disjoint, and thus we have:

(5.6) $$ \Bigl| \sum_{|R|=2^{-n},\ |R_1|=2^{-r_1}} a(R) h_R \Bigr|^2 = \sum_{|R|=2^{-n},\ |R_1|=2^{-r_1}} |a(R)|^2 \, 1_R , $$

so our proof is complete in this case.

In the higher dimensional case, the key point is to observe that the last term can be
viewed as an $\ell^2$ space valued function, that is if we fix all the coordinates except $x_1$ and
define an $\ell^2$-valued function

$$ F := \Bigl\{ \sum_{|R|=2^{-n},\ |R_1|=2^{-r_1}} a(R) h_R \Bigr\}_{r_1=1}^{n} , $$

then the expression inside the $L^p$ norm on the right hand side of (5.5) is exactly $|F|_{\ell^2}$. Thus,
the Hilbert space valued Littlewood Paley inequality applies to the second coordinate, to
give us

$$ \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_p \lesssim p \, \Bigl\| \Bigl[ \sum_{r_1=1}^{n} \sum_{r_2=1}^{n} \Bigl| \sum_{|R|=2^{-n},\ |R_j|=2^{-r_j},\ j=1,2} a(R) h_R \Bigr|^2 \Bigr]^{1/2} \Bigr\|_p . $$

Observe that we have a full power of $p$, due to the two applications of the Littlewood Paley
inequalities. And if $d = 3$, then the analog of (5.6) holds, completing the proof in this case.

In the case of dimension $d \ge 4$ note that we can continue applying the Littlewood
Paley inequalities inductively. They need only be used $d - 1$ times due to the hyperbolic
assumption. Thus, we have the inequality

$$ \Bigl\| \sum_{|R|=2^{-n}} a(R) h_R \Bigr\|_p \lesssim p^{(d-1)/2} \Bigl\| \Bigl[ \sum_{|R|=2^{-n}} |a(R)|^2 \, 1_R \Bigr]^{1/2} \Bigr\|_p , \qquad 2 \le p < \infty . $$

The implied constant depends upon dimension; the main point we are interested in is
the rate of growth of the $L^p$ norms. Assuming that the Square Function of the sum is
bounded in $L^\infty$, the $L^p$ norms can only grow at the rate of $p^{(d-1)/2}$, which completes the
proof. □

This theorem illustrates a thesis of A. Zygmund, which says that the estimates on product 
domains are controlled by the effective number of parameters, which in our hyperbolic 
setting is d - 1. The method of iteration of the one parameter inequalities, in the vector 
valued setting, is a common technique in the subject, see for instance [11,12]. We shall 
repeatedly make use of this technique in the present paper. 



6. Definitions and Initial Lemmas for Dimension Three 

As it has been already pointed out, the principal difficulty in three and higher dimensions 
is that the product of Haar functions is not necessarily a Haar function. On this point, we 
have the following higher dimensional analogue of the 'product rule' (3.1): 

6.1. Proposition. Suppose that $R_1, \dots, R_k$ are rectangles such that there is no choice of $1 \le j < j' \le k$
and no choice of coordinate $1 \le t \le d$ for which we have $R_{j,t} = R_{j',t}$. Then, for a choice of
sign $\varepsilon \in \{\pm 1\}$ we have

(6.2) $$ \prod_{j=1}^{k} h_{R_j} = \varepsilon \, h_S , \qquad S = \bigcap_{j=1}^{k} R_j . $$

Proof. Expand the product as

$$ \prod_{m=1}^{k} h_{R_m}(x_1, \dots, x_d) = \prod_{m=1}^{k} \prod_{t=1}^{d} h_{R_{m,t}}(x_t) . $$

Our assumption is that for each $t$, there is exactly one choice of $1 \le m_0 \le k$ such that
$R_{m_0,t} = S_t$. And moreover, since the minimum value of $|R_{m,t}|$ is obtained exactly once, for
$m \ne m_0$, we have that $h_{R_{m,t}}$ is constant on $S_t$. Thus, in the $t$ coordinate, the product is

$$ h_{S_t}(x_t) \prod_{1 \le m \ne m_0 \le k} h_{R_{m,t}}(S_t) = \varepsilon_t \, h_{S_t}(x_t) , \qquad \text{where } \varepsilon_t \in \{\pm 1\} . $$

This proves the Proposition. □

Remark. It is also a useful observation that the products of Haar functions have mean zero
if the minimum value of $|R_{m,t}|$ is unique for at least one coordinate $t$.

Let $\vec r \in \mathbb N^3$ be a partition of $n$, thus $\vec r = (r_1, r_2, r_3)$, where the $r_j$ are non negative integers
and $|\vec r| := \sum_t r_t = n$. Denote all such vectors as $\mathbb H_n$. ('H' for 'hyperbolic.') These vectors
will specify the geometry of the rectangles, i.e. we set $\mathcal R_{\vec r} = \{R \in \mathcal D^3 : |R_j| = 2^{-r_j},\ j = 1, 2, 3\}$.
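
For small $n$ the set of hyperbolic vectors can simply be listed. The following is a minimal sketch; the function name `hyperbolic_vectors` is a hypothetical choice, and the default `d=3` reflects the three dimensional setting of this section.

```python
def hyperbolic_vectors(n, d=3):
    """All vectors of d non-negative integers summing to n, in lexicographic order."""
    if d == 1:
        return [(n,)]
    return [(first,) + rest
            for first in range(n + 1)
            for rest in hyperbolic_vectors(n - first, d - 1)]
```

In dimension three the list has $(n+1)(n+2)/2$ entries, which is the source of the two free parameters in the hyperbolic setting.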

We call a function $f$ an r function with parameter $\vec r$ if

(6.3) $$ f = \sum_{R \in \mathcal R_{\vec r}} \varepsilon_R \, h_R , \qquad \varepsilon_R \in \{\pm 1\} . $$

We will use $f_{\vec r}$ to denote a generic r function. A fact used without further comment is that

$$ f_{\vec r}^{\,2} \equiv 1 . $$

6.4. Definition. For vectors $\vec r_j \in \mathbb N^3$, say that $\vec r_1, \dots, \vec r_J$ are strongly distinct iff for coordinates
$1 \le t \le 3$ the integers $\{r_{j,t} : 1 \le j \le J\}$ are distinct. The product of strongly distinct r
functions is also an r function, which follows from 'the product rule' (6.1).
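
The definition translates into a one-line test. The sketch below is illustrative only; the function name `strongly_distinct` and the representation of vectors as tuples are assumptions.

```python
def strongly_distinct(vectors):
    """True iff, in every coordinate, the entries of the given vectors are pairwise distinct."""
    # zip(*vectors) regroups the data by coordinate: one tuple of entries per coordinate.
    return all(len(set(column)) == len(column) for column in zip(*vectors))
```

For example, the vectors `(1, 2, 3)` and `(0, 2, 4)` fail the test because they agree in the second coordinate.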

The r functions we are interested in are

(6.5) $$ f_{\vec r} := \sum_{R \in \mathcal R_{\vec r}} \mathrm{sgn}(a(R)) \, h_R , $$

where $H_n = \sum_{|R| \ge 2^{-n}} a(R) h_R$.

7. Jozsef Beck's Short Riesz Product

Let us define relevant parameters by

(7.1) $$ q = a n^{\varepsilon} , \qquad b = \tfrac16 , $$

(7.2) $$ \tilde\rho = a q^{b} n^{-1} , \qquad \rho = \sqrt q \, n^{-1} . $$

Here, $a$ is a small positive constant; we use the notation of $b = \frac16$ throughout, so as
not to obscure those aspects of the argument that dictate this choice of $b$. $\tilde\rho$ is a
'false' $L^2$ normalization for the sums we consider, while the larger term $\rho$ is the 'true' $L^2$
normalization. Our 'gain over the trivial estimate' in the Small Ball Conjecture is $q^{b} = n^{\varepsilon/6}$.
$0 < \varepsilon < 1$ is a small constant; the exact determination of what we could take $\varepsilon$ equal to in
this proof doesn't seem to be worth calculating as it surely will not be optimal.

In Beck's paper, the value of $q = q_{\mathrm{Beck}} \simeq \frac{\log n}{\log\log n}$ was much smaller than our value of $q$. The
point of this choice is that $q_{\mathrm{Beck}}^{\,q_{\mathrm{Beck}}} \simeq n$, with the term $q^{q}$ controlling many of the combinatorial
issues concerning the expansion of the Riesz product (see footnote 4). With our substantially larger value
of $q$, we need to introduce additional tools to control the combinatorics. These tools are

• A Riesz product that will permit us to implement various conditional expectation 
arguments. 

• Attention to $L^p$ estimates of various sums, and their growth rates in $p$.

• Systematic use of the Littlewood-Paley inequalities, with the sharp constants in p. 

Divide the integers $\{1, 2, \dots, n\}$ into $q$ disjoint increasing intervals $I_1, \dots, I_q$, and let
$\mathbb A_t := \{\vec r \in \mathbb H_n : r_1 \in I_t\}$. Let

(7.3) $$ F_t = \sum_{\vec r \in \mathbb A_t} f_{\vec r} . $$
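
The division into intervals can be sketched as follows. The text does not specify how the intervals are chosen, so the choice of nearly equal lengths below is an assumption, as are the function names; the index returned is 0-based, so block `t` corresponds to $I_{t+1}$.

```python
def split_into_intervals(n, q):
    """Split {1, ..., n} into q consecutive blocks whose sizes differ by at most one.

    Equal-sized blocks are an illustrative assumption, not a choice made in the text.
    """
    base, extra = divmod(n, q)
    blocks, start = [], 1
    for t in range(q):
        size = base + (1 if t < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def class_index(r, blocks):
    """0-based index t of the block containing r_1, or None when r_1 = 0 lies in no block."""
    for t, block in enumerate(blocks):
        if r[0] in block:
            return t
    return None
```

Because the classes are determined by the first coordinate alone, vectors taken from different classes automatically differ in their first coordinate.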

The Riesz product is now a 'short product.'

$$ \Psi := \prod_{t=1}^{q} (1 + \tilde\rho F_t) . $$

The 'false' $L^2$ normalization implies that the product is, with high probability, positive,
and thus $\|\Psi\|_1 \approx \mathbb E \Psi$, with expectations being typically easier to estimate. This heuristic is
made precise below.

Footnote 4. Specifically, $q^{Cq}$ is a naive bound for the number of admissible graphs, as defined in § 10.

Proposition 6.1 suggests that we should decompose the product $\Psi$ into

(7.4) $$ \Psi = 1 + \Psi^{\mathrm{sd}} + \Psi^{\neg} , $$

where the two pieces are the 'strongly distinct' and 'not strongly distinct' pieces. To be
specific, for integers $1 \le u \le q$, let

$$ \Psi^{\mathrm{sd}}_u := \tilde\rho^{\,u} \sum_{1 \le v_1 < \cdots < v_u \le q} \ \ \sum^{\mathrm{sd}}_{\vec r_t \in \mathbb A_{v_t}} \ \prod_{t=1}^{u} f_{\vec r_t} . $$

Here $\sum^{\mathrm{sd}}$ is taken to be over all $\vec r_t \in \mathbb A_{v_t}$, $1 \le t \le u$, such that:

(7.5) the vectors $\{\vec r_t : 1 \le t \le u\}$ are strongly distinct.

Then define

(7.6) $$ \Psi^{\mathrm{sd}} := \sum_{u=1}^{q} \Psi^{\mathrm{sd}}_u . $$

With this definition, it is clear that we have

(7.7) $$ \langle H_n, \Psi^{\mathrm{sd}} \rangle = \langle H_n, \Psi^{\mathrm{sd}}_1 \rangle \gtrsim q^{b} \cdot n^{-1} \cdot 2^{-n} \sum_{|R|=2^{-n}} |a(R)| , \qquad H_n = \sum_{|R| \ge 2^{-n}} a(R) h_R . $$

$q^{b}$ is our 'gain over the trivial estimate', once we prove that $\|\Psi^{\mathrm{sd}}\|_1 \lesssim 1$ (estimate (7.14)
below). Proving this inequality is the main goal of the technical estimates of the following
Lemma:

7.8. Lemma. We have these estimates:

(7.9) $$ \mathbb P(\Psi < 0) \lesssim \exp(-A q^{1/2 - b}) ; $$

(7.10) $$ \|\Psi\|_2 \lesssim \exp(a' q^{2b}) ; $$

(7.11) $$ \mathbb E \Psi = 1 ; $$

(7.12) $$ \|\Psi\|_1 \lesssim 1 ; $$

(7.13) $$ \|\Psi^{\neg}\|_1 \lesssim 1 ; $$

(7.14) $$ \|\Psi^{\mathrm{sd}}\|_1 \lesssim 1 . $$

Here, $0 < a' < 1$, in (7.10), is a small constant, decreasing to zero as $a$ in (7.1) goes to zero; and
$A > 1$, in (7.9), is a large constant, tending to infinity as $a$ in (7.1) goes to zero.

Proof. We give the proof of the Lemma, assuming our main inequalities proved in the 
subsequent sections. 

Proof of (7.9). We first note that Theorem 5.3 implies that $\rho F_t$ is in $\exp(L)$. Then using the
distributional estimate of Proposition 5.2, we estimate

$$ \mathbb P(\Psi < 0) \le \sum_{t=1}^{q} \mathbb P(\tilde\rho F_t < -1) = \sum_{t=1}^{q} \mathbb P(\rho F_t < -a^{-1} q^{1/2 - b}) \lesssim \exp(-A q^{1/2 - b}) . $$


Proof of (7.10). The proof of this is detailed enough and uses the results of subsequent
sections, so we postpone it to Lemma 9.1 below.

It is important for our purposes in the proof of the current Lemma to note that Lemma 9.1
proves a uniform estimate, namely

(7.15) $$ \sup_{V \subset \{1, \dots, q\}} \mathbb E \prod_{v \in V} (1 + \tilde\rho F_v)^2 \lesssim \exp(a' q^{2b}) . $$

Proof of (7.11). Expand the product in the definition of $\Psi$. The leading term is one. Every
other term is a product

$$ \prod_{k \in V} \tilde\rho F_k , $$

where $V$ is a non-empty subset of $\{1, \dots, q\}$. This product is in turn a linear combination of
products of r functions. Among each such product, the maximum in the first coordinate
is unique. This fact tells us that the expectation of these products of r functions is zero. So
the expectation of the product above is zero. The proof is complete.

Proof of (7.12). We use the first two estimates of our Lemma. Observe that

$$ \|\Psi\|_1 = \mathbb E \Psi - 2 \, \mathbb E \Psi 1_{\Psi < 0} \le 1 + 2 \, \mathbb P(\Psi < 0)^{1/2} \|\Psi\|_2 \lesssim 1 + \exp(-A q^{1/2 - b}/2 + a' q^{2b}) . $$

We have taken $b = 1/6$ so that $1/2 - b = 2b$. For sufficiently small $a$ in (7.1), we will have
$A \gtrsim a'$. We see that (7.12) holds.

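In light of the estimate (7.15), we see that the argument above proves

(7.16) $$ \sup_{V \subset \{1, \dots, q\}} \Bigl\| \prod_{v \in V} (1 + \tilde\rho F_v) \Bigr\|_1 \lesssim 1 . $$

Proof of (7.13). The primary facts are (7.16) and Theorem 10.1; we use the notation
devised for that Theorem.

Note that the Inclusion-Exclusion principle gives us the identity

$$ \Psi^{\neg} = \sum_{V \subset \{1, \dots, q\},\ |V| \ge 2} (-1)^{|V|+1} \, \mathrm{Prod}(\mathrm{NSD}(V)) \cdot \prod_{t \in \{1, \dots, q\} - V} (1 + \tilde\rho F_t) . $$

We use the triangle inequality, the estimates of Lemma 9.1, Hölder's inequality, with
indices $1 + q^{-2b}$ and $1 + q^{2b}$, and the estimate of (10.2) in the calculation below. Notice that we
have

$$ \sup_{V \subset \{1, \dots, q\}} \Bigl\| \prod_{v \in V} (1 + \tilde\rho F_v) \Bigr\|_{1 + q^{-2b}} \le \sup_{V \subset \{1, \dots, q\}} \Bigl\| \prod_{v \in V} (1 + \tilde\rho F_v) \Bigr\|_1^{(1 - q^{-2b})/(1 + q^{-2b})} \times \Bigl\| \prod_{v \in V} (1 + \tilde\rho F_v) \Bigr\|_2^{2 q^{-2b}/(1 + q^{-2b})} $$
$$ \lesssim \exp\bigl(a'/(1 + q^{-2b})\bigr) \lesssim 1 . $$

We now estimate

$$ \|\Psi^{\neg}\|_1 \le \sum_{V \subset \{1, \dots, q\},\ |V| \ge 2} \Bigl\| \mathrm{Prod}(\mathrm{NSD}(V)) \cdot \prod_{t \in \{1, \dots, q\} - V} (1 + \tilde\rho F_t) \Bigr\|_1 $$
$$ \le \sum_{V \subset \{1, \dots, q\},\ |V| \ge 2} \| \mathrm{Prod}(\mathrm{NSD}(V)) \|_{1 + q^{2b}} \cdot \Bigl\| \prod_{t \in \{1, \dots, q\} - V} (1 + \tilde\rho F_t) \Bigr\|_{1 + q^{-2b}} $$
$$ \lesssim \sum_{v=2}^{q} \bigl[ q^{C'} n^{-\kappa} \bigr]^{v} \lesssim n^{-\epsilon'} \lesssim 1 . $$

Proof of (7.14). This follows from (7.13) and (7.12) and the identity $\Psi = 1 + \Psi^{\mathrm{sd}} + \Psi^{\neg}$ and
the triangle inequality.

□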

8. The Beck Gain in the Simplest Instance 

Beck considered sums of products of r functions that are not strongly distinct, and
observed that the $L^2$ norm of the same is smaller than one would naively expect. This
is what we call the Beck Gain. A product of r functions will not be strongly distinct if the
product involves two or more vectors which agree in one or more coordinates. In this
section, we study the sums of products of two r functions which are not strongly distinct.
A later section, § 10, will study the general case. The results of this Section are critical to
the next section, in which we bound the $L^2$ norm of our Riesz product.

In this section, and again in § 10, we will use this notation. For a subset $C \subset \mathbb H_n^k$, let

(8.1) $$ \mathrm{Prod}(C) := \sum_{(\vec r_1, \dots, \vec r_k) \in C} \ \prod_{j=1}^{k} f_{\vec r_j} . $$

In this section, we are exclusively interested in $k = 2$.

Let $C(2) \subset \mathbb H_n^2$ consist of all pairs of distinct r vectors $\{\vec r_1, \vec r_2\}$ for which $r_{1,2} = r_{2,2}$. J. Beck
calls such terms 'coincidences' and we will continue to use that term. We need norm
estimates on the sums of products of such r vectors.
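
To get a feeling for the size of the collection of coincidences, one can enumerate it for small $n$. The sketch below is an illustration only; it treats pairs as unordered, which is one reading of the notation $\{\vec r_1, \vec r_2\}$, and the function name is hypothetical.

```python
from itertools import combinations


def coincidences(n):
    """Unordered pairs of distinct vectors in H_n (d = 3) that agree in the second coordinate."""
    vectors = [(r1, r2, n - r1 - r2)
               for r1 in range(n + 1)
               for r2 in range(n + 1 - r1)]
    return [(r, s) for r, s in combinations(vectors, 2) if r[1] == s[1]]
```

There are $n - c + 1$ vectors with second coordinate $c$, so the number of such pairs is $\sum_{c=0}^{n} \binom{n-c+1}{2} = \binom{n+2}{3}$, which grows like $n^3/6$. This matches the count of three free parameters: the common second coordinate and the two first coordinates.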

8.2. Lemma. [The Simplest Instance of the Beck Gain.] We have these estimates for arbitrary
subsets $C \subset C(2)$:

(8.3) $$ \|\mathrm{Prod}(C)\|_p \lesssim p^{5/4} n^{7/4} . $$

Moreover, if we have $C = C(2) \cap \mathbb A_s \times \mathbb A_t$ for some $1 \le s, t \le q$ we have

(8.4) $$ \|\mathrm{Prod}(C)\|_p \lesssim p^{3/2} n^{3/2} . $$

The second estimate of the Lemma appears to be sharp, in that the collection $C(2)$ has
three free parameters, and the estimate is in terms of $n^{3/2}$. Note that for $p \simeq n$ we have

$$ \|\mathrm{Prod}(C(2))\|_p \simeq \|\mathrm{Prod}(C(2))\|_\infty . $$

And the latter term can be as big as $n^3$, which matches the bound above. Thus we only
need to deal with the case $p \le n$.

The proof of the Lemma requires that we pass through an intermediary collection of four
tuples of r vectors. Let $B(4) \subset \mathbb H_n^4$ be four tuples of distinct vectors $(\vec r, \vec s, \vec t, \vec u)$ for which (i)
$r_2 = s_2$ and $t_2 = u_2$; and (ii) in the first and third coordinate the maximum is achieved twice.

Proof. The method of proof is probably best explained by considering first the case of $p = 2$.
Observe that

$$ \|\mathrm{Prod}(C)\|_2^2 = \mathbb E \, \mathrm{Prod}(B) + \mathbb E \, \mathrm{Prod}(\widetilde B) , $$

where $B = C \times C \cap B(4)$ and $\widetilde B$ is a collection of four-tuples in $C \times C$ in which some of the
vectors completely coincide. Indeed, the main point is that

$$ \mathbb E \, f_{\vec r} \cdot f_{\vec s} \cdot f_{\vec t} \cdot f_{\vec u} \ne 0 $$

iff the maximum is not unique in each coordinate. But, if the vectors are distinct, this is
the definition of $B(4)$. Thus the case $p = 2$ follows almost immediately from Lemma 8.6
below, since $\mathbb E \, \mathrm{Prod}(\widetilde B)$ is easy to estimate.

Now, let us consider $p \ge 4$. Each pair $(\vec r, \vec s) \in C$ must be distinct in the first and
third coordinates. Therefore, we can apply the Littlewood Paley inequalities in those
coordinates, very much in the same fashion as it was done in the proof of Theorem 5.3, to
estimate

$$ \|\mathrm{Prod}(C)\|_p \lesssim p \, \Bigl\| \Bigl[ \sum_{a,b} \Bigl| \sum_{(\vec r, \vec s) \in C,\ \max(r_1, s_1) = a,\ \max(r_3, s_3) = b} f_{\vec r} \cdot f_{\vec s} \Bigr|^2 \Bigr]^{1/2} \Bigr\|_p . $$

Here, we have a full power of $p$, as we apply the Littlewood Paley inequalities twice.
Observe that

$$ \sum_{a,b} \Bigl| \sum_{(\vec r, \vec s) \in C,\ \max\{r_1, s_1\} = a,\ \max\{r_3, s_3\} = b} f_{\vec r} \cdot f_{\vec s} \Bigr|^2 = \sharp C + \sum_{i < j \in \{1,2,3,4\}} \mathrm{Prod}(C_{i,j}) + \mathrm{Prod}(B) . $$

The term $\sharp C$ arises from the diagonal of the square. The terms $C_{i,j}$ are

$$ C_{i,j} := \{ (\vec r_1, \vec r_2, \vec r_3, \vec r_4) \in C \times C : \vec r_i = \vec r_j , \text{ and the other two vectors are distinct} \} . $$

Note that by definition, $C_{1,2} = C_{3,4} = \emptyset$; in other cases, the $C_{i,j}$ are of the same class of
objects as $C$. The term $B$ we have already defined.

Then, we can estimate by the triangle inequality, and the sub-additivity of $x \mapsto \sqrt x$,

(8.5) $$ p^{-1} \|\mathrm{Prod}(C)\|_p \lesssim (\sharp C)^{1/2} + \sum_{i < j \in \{1,2,3,4\}} \|\mathrm{Prod}(C_{i,j})\|_{p/2}^{1/2} + \|\mathrm{Prod}(B)\|_{p/2}^{1/2} . $$

This inequality is useful for induction.

Let us consider the case of (8.4). We have already seen that $N(2) \lesssim n^{3/2}$. Hence (8.5)
implies that for $p = 2^{v+1}$

$$ N(2^{v+1}) \lesssim 2^{v+1} \bigl\{ n^{3/2} + 4 N(2^{v})^{1/2} \bigr\} . $$

Clearly, this can be recursively applied, to yield a proof of (8.4) in the case $p \le n$. But the
case of $p \ge n$ is trivial, as the $L^\infty$ norm of the terms we are estimating is at most $n^3$.

□

8.6. Lemma. For any subset $B \subset B(4)$

(8.7) $$ \|\mathrm{Prod}(B)\|_p \lesssim \sqrt p \, n^{7/2} . $$

If we do not consider arbitrary subsets, the estimate improves. We have the following

(8.8) $$ \|\mathrm{Prod}(B(4) \cap (\mathbb A_s \times \mathbb A_t)^2)\|_p \lesssim p \, n^{3} . $$

This Lemma, with exponents on $n$ being $n^{7/2}$, appears in Beck's paper [1], in the case of
$p = 2$. The $L^p$ variants, following from consequences of Littlewood-Paley inequalities, are
important for us.

The first estimate is recorded, as it is interesting that it applies to arbitrary subsets of
$B(4)$. We will rely upon the second estimate. Pointed out to us by Mihalis Kolountzakis,
this estimate is better for all ranges of $p \le n$.

Proof. We discuss (8.7). The proof is a case analysis, depending upon the number of vectors among $\{\vec r, \vec s, \vec t, \vec u\}$ at which the maximums occur in the first and third coordinates. We proceed immediately to the cases.

Let $B_2 \subset B$ consist of those four-tuples $\{\vec r, \vec s, \vec t, \vec u\}$ for which

$$r_1 = t_1 = \max\{r_1, s_1, t_1, u_1\} , \qquad r_3 = t_3 = \max\{r_3, s_3, t_3, u_3\} .$$

This collection is empty, for necessarily we must have $r_2 = s_2 = t_2 = u_2$, but then $\vec r = \vec s$, as the parameters of all vectors is $n$. This violates the definition of $B$.
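The emptiness of this case rests on the hyperbolic constraint: a vector with $r_1 + r_2 + r_3 = n$ is determined by any two of its coordinates. The following Python sketch checks this by brute force for a small $n$. The helper names, and the choice to allow zero coordinates, are assumptions made only for illustration and are not notation from the text.

```python
from itertools import combinations

def hyperbolic_vectors(n):
    """All integer vectors (r1, r2, r3) with non-negative entries and r1 + r2 + r3 = n.

    Allowing zero entries is an assumption of this sketch.
    """
    return [(r1, r2, n - r1 - r2)
            for r1 in range(n + 1)
            for r2 in range(n + 1 - r1)]

def distinct_pairs_agreeing_in(n, coords):
    """Pairs of distinct hyperbolic vectors that agree in every coordinate index in `coords`."""
    vecs = hyperbolic_vectors(n)
    return [(r, t) for r, t in combinations(vecs, 2)
            if all(r[k] == t[k] for k in coords)]

# Agreement in the first and third coordinates (indices 0 and 2) forces equality,
# so no pair of distinct vectors survives.
print(distinct_pairs_agreeing_in(6, (0, 2)))
```

Agreement in a single coordinate, by contrast, leaves many distinct pairs, which is why a single coincidence does not make a collection empty.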

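Let $B_3 \subset B$ consist of those four-tuples $\{\vec r, \vec s, \vec t, \vec u\}$ for which

$$r_1 = t_1 = \max\{r_1, s_1, t_1, u_1\} , \qquad r_3 = u_3 = \max\{r_3, s_3, t_3, u_3\} .$$

That is, the maximal values involve three distinct vectors. These four vectors can be depicted as

$$\vec r = \begin{pmatrix} \square \\ r_2 \\ r_3 \end{pmatrix}, \quad \vec s = \begin{pmatrix} * \\ r_2 \\ \square \end{pmatrix}, \quad \vec t = \begin{pmatrix} r_1 \\ t_2 \\ \square \end{pmatrix}, \quad \vec u = \begin{pmatrix} \square \\ t_2 \\ r_3 \end{pmatrix} .$$

A $\square$ denotes a parameter which is determined by other choices. It is essential to note that choices of $r_2$ and $r_3$ determine the value of $r_1$ (hence the $\square$ in the first coordinate for $\vec r$), and so the vector $\vec r$. The only free parameters are (say) $s_1$, denoted by an $*$ above.

But, note that we must then have $|\vec s| = s_1 + s_2 + s_3 < n$, since $s_1 \le r_1$, $s_2 = r_2$, $s_3 \le r_3$ and $\vec s \ne \vec r$. Therefore this case is empty.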

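Let $B_4$ be those four-tuples $\{\vec r, \vec s, \vec t, \vec u\} \in B$ such that $s_1 = u_1$ and $r_3 = t_3$. That is, there are four vectors involved in the maximums of the first and third coordinates. These four vectors can be represented as

(8.9) $$\vec r = \begin{pmatrix} \square \\ r_2 \\ r_3 \end{pmatrix}, \quad \vec s = \begin{pmatrix} s_1 \\ r_2 \\ \square \end{pmatrix}, \quad \vec t = \begin{pmatrix} \square \\ t_2 \\ r_3 \end{pmatrix}, \quad \vec u = \begin{pmatrix} s_1 \\ t_2 \\ \square \end{pmatrix} .$$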

The next argument proves (8.7). Let $B_4(a,a',b)$ be those four tuples $\{\vec r,\vec s,\vec t,\vec u\} \in B$ such that

$$r_2 = s_2 = a , \qquad t_2 = u_2 = a' , \qquad s_1 = u_1 = b .$$

The point to observe is that

$$\|\mathrm{Prod}(B_4(a,a',b))\|_p \le C \sqrt{p}\, \sqrt{n} .$$

As there are at most $n^3$ choices for $a, a', b$ this will prove the Lemma.

Indeed, we have not specified $r_3 = t_3$. Since all vectors are distinct, we can assume without loss of generality that $a < a'$ (and thus $r_1 > t_1$) and in considering the norm above, we ignore $\vec s$ and $\vec u$, as they are completely specified by the datum $(a,a',b)$. We apply the Littlewood-Paley inequality in the first coordinate to the product $f_{\vec r} \cdot f_{\vec t}$:

$$\Big\| \sum_{\vec r, \vec t} f_{\vec r}\cdot f_{\vec t} \Big\|_p \lesssim \sqrt{p}\, \Big\| \Big[ \sum_{c} \Big| \sum_{\substack{\vec r, \vec t \,: \\ t_1 < r_1 = c}} f_{\vec r}\cdot f_{\vec t} \Big|^2 \Big]^{1/2} \Big\|_p \lesssim \sqrt{p}\, \sqrt{n} ,$$

since $\vec r$ and $\vec t$ are completely specified once $r_1$ is fixed. The proof of (8.7) is finished.

We turn to the proof of (8.8), arguing similarly. We have already seen that the only non-empty case is $B_4$. Let $B_4(a,a')$ be those four tuples $\{\vec r,\vec s,\vec t,\vec u\} \in B_4$ such that

$$r_2 = s_2 = a , \qquad t_2 = u_2 = a' .$$

The point to observe is that

$$\|\mathrm{Prod}(B_4(a,a'))\|_p \le C p\, n .$$

As there are at most $n^2$ choices for $a, a'$ this proves the Lemma.

The point is that $\mathrm{Prod}(B_4(a,a'))$ almost splits into a product. Namely, if we define

$$B_{4,1}(a,a') := \{\{\vec r,\vec t\} : r_2 = a,\ t_2 = a',\ r_3 = t_3\},$$

$$B_{4,2}(a,a') := \{\{\vec s,\vec u\} : s_2 = a,\ u_2 = a',\ s_1 = u_1\},$$

we will have

(8.10) $\mathrm{Prod}(B_4(a,a')) = \mathrm{Prod}(B_{4,1}(a,a')) \cdot \mathrm{Prod}(B_{4,2}(a,a')) - \mathrm{Prod}(M)$,

where $M \subset B_{4,1}(a,a') \times B_{4,2}(a,a')$ consists of quadruples in which the coincidence either in the first or the third coordinate is not a maximum in that coordinate.

We first prove the estimate

(8.11) $\|\mathrm{Prod}(B_{4,k}(a,a'))\|_{2p} \lesssim \sqrt{p} \cdot n^{1/2}$, $k = 1,2$.

We may assume without loss of generality that $k = 1$, and $a > a'$. The pairs in $B_{4,1}(a,a')$ consist of the two vectors $\vec r$ and $\vec t$ in (8.9). These two vectors are parameterized by $t_1$, say. Since $a = r_2 > a' = t_2$, and $r_3 = t_3$, the hyperbolic assumption implies $t_1$ is the maximal coordinate. Therefore, the Littlewood-Paley inequality in this coordinate applies.

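Now we deal with the term $\mathrm{Prod}(M)$. For this, assume that in the first coordinate the maximum is achieved at $r_1$. This situation is depicted below:

(8.12) $$\vec r = \begin{pmatrix} \max \\ a \\ r_3 \end{pmatrix}, \quad \vec t = \begin{pmatrix} t_1 \\ a' \\ r_3 \end{pmatrix}, \quad \vec s = \begin{pmatrix} s_1 \\ a \\ * \end{pmatrix}, \quad \vec u = \begin{pmatrix} s_1 \\ a' \\ * \end{pmatrix} .$$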

Notice that in this situation the maximum in the third coordinate cannot be $r_3 = t_3$, for we would then have $s_1 + s_2 + s_3 < r_1 + r_2 + r_3 = n$. So, the maximum in this coordinate is $s_3$ or $u_3$. Also notice that, with $a$ and $a'$ fixed, choosing the values of $r_1$ and $s_3$ (or $u_3$) completely determines the quadruple of vectors. Thus we can apply the Littlewood-Paley inequality twice, in the first and the third coordinates, which yields

(8.13) $\|\mathrm{Prod}(M)\|_p \lesssim (\sqrt{p}\,\sqrt{n})^2 = p\,n$.

Combining (8.10), (8.11) and (8.13), we see that we have proved

$$\|\mathrm{Prod}(B_4(a,a'))\|_p \lesssim p\, n .$$

The proof is complete.

□

There is another corollary to the proof above required at a later stage of the proof. For an integer $a$, let $B_a(4)$ be the four tuples of distinct vectors $(\vec r,\vec s,\vec t,\vec u)$ for which (i) $r_2 = s_2$ and $t_2 = u_2$; and (ii) in the first coordinate we have $s_1 = u_1 = a$; and (iii) two of the four vectors agree in the third coordinate.

8.14. Lemma. For any integer $a$, and subset $B \subset B_a(4)$ we have

(8.15) $\|\mathrm{Prod}(B)\|_p \lesssim p\, n^{5/2}$.

The point of this estimate is that we reduce the number of parameters of $B(4)$ by one, and gain a full power of $n$ in the size of the $L^p$ norm, as compared to the estimate in (8.7).

Proof. In the proof of Lemma 8.6, in the analysis of the terms $B_4$ we used the triangle inequality over the term $b = s_1 = u_1$. Treating this coordinate as fixed, we gain a term $n^{-1}$ in the previous proof, hence proving the Lemma above.

□ 

A further sub-case of the inequality (8.3) demands attention. Using the notation of Lemma 8.2, let

(8.16) $C_{2,b} := \{(\vec r_1, \vec r_2) \in C_2 : r_{1,1} = b\}$, $1 \le b \le n$.

Thus, this collection consists of pairs of distinct vectors, with a coincidence in the second coordinate, and the first coordinate of $\vec r_1$ is fixed. Note that these collections of variables have two free parameters. At $L^2$ we find a $1/4$ gain over the 'naive' estimate.

8.17. Lemma. For any $b$ and any subset $C \subset C_{2,b}$, we have the estimates

(8.18) $\|\mathrm{Prod}(C)\|_p \lesssim p \cdot n^{5/4}$, $2 \le p < \infty$.

Proof. As in the proof of Lemma 8.2, we begin with the case $p = 2$. Observe that

$$\|\mathrm{Prod}(C)\|_2^2 = \mathbb{E}\, \mathrm{Prod}(B),$$

where $B = C_{2,b} \times C_{2,b} \cap B_b(4)$, with the last collection defined in Lemma 8.14. Therefore, the Lemma in this case follows from that Lemma.

More generally, no pair of vectors in $C_{2,b}$ can have a coincidence in the third coordinate, so we can use the Littlewood-Paley inequalities in that coordinate to estimate

$$\|\mathrm{Prod}(C)\|_p \lesssim \sqrt{p}\, \Big\| \Big[ \sum_{c} \Big| \sum_{\substack{(\vec r_1,\vec r_2)\in C \\ \max(r_{1,3},\, r_{2,3})=c}} f_{\vec r_1}\cdot f_{\vec r_2} \Big|^2 \Big]^{1/2} \Big\|_p .$$

Observe that

(8.19) $$\sum_{c} \Big| \sum_{\substack{(\vec r_1,\vec r_2)\in C \\ \max(r_{1,3},\, r_{2,3})=c}} f_{\vec r_1}\cdot f_{\vec r_2} \Big|^2 = \sharp C + \sum_{i<j\in\{1,2,3,4\}} \mathrm{Prod}(C_{i,j}) + \mathrm{Prod}(B) .$$

Similar to before, we define the collections $C_{i,j}$ as follows.

$$C_{i,j} := \{(\vec r_1, \vec r_2, \vec r_3, \vec r_4) \in C \times C : \vec r_i = \vec r_j, \text{ and the other two vectors are distinct}\}.$$

In this case, observe that five of these collections are empty, namely

$$C_{1,2} = C_{1,4} = C_{2,3} = C_{2,4} = C_{3,4} = \emptyset .$$

The only non-empty collection is $C_{1,3}$. Yet, in $C_{1,3}$, the vectors $\vec r_2$ and $\vec r_4$ have a coincidence in the first coordinate. Thus, Lemma 8.2 applies to $C_{1,3}$, so that we have the estimate

(8.20) $\|\mathrm{Prod}(C_{1,3})\|_p \lesssim p^{5/4} n^{7/4}$.

Let us prove (8.18). Combining these observations with (8.19) and Lemma 8.14 we see that

$$p^{-1/2} \|\mathrm{Prod}(C)\|_p \lesssim n + \|\mathrm{Prod}(C_{1,3})\|_{p/2}^{1/2} + \|\mathrm{Prod}(B)\|_{p/2}^{1/2}$$

$$\lesssim n + p^{5/8} n^{7/8} + p^{1/2} n^{5/4} .$$

Concerning the right hand side, note that for $2 \le p \le n^3$, we have $p^{5/8} n^{7/8} \le p^{1/2} n^{5/4}$. Hence we have proved

$$\|\mathrm{Prod}(C)\|_p \lesssim p\, n^{5/4} , \qquad 2 \le p \le n^3 .$$

Yet, for $p > n$ the $L^p$ norm above is comparable to the $L^\infty$ norm, so we have finished the proof of (8.18).
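The comparison of the two terms used above reduces to $p^{1/8} \le n^{3/8}$, that is, $p \le n^3$. The short Python sketch below confirms this exactly by comparing eighth powers of both sides, so that only integer arithmetic is involved. The function name is hypothetical and serves only this illustration.

```python
def middle_term_is_dominated(p, n):
    """Exact test of p**(5/8) * n**(7/8) <= p**(1/2) * n**(5/4) for positive integers.

    Both sides are raised to the eighth power, giving p**5 * n**7 <= p**4 * n**10,
    which avoids floating point rounding.
    """
    return p**5 * n**7 <= p**4 * n**10

# The inequality holds exactly on the range p <= n**3.
n = 5
print(all(middle_term_is_dominated(p, n) == (p <= n**3) for p in range(2, 300)))
```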

□ 

9. The L 2 Norm of the Riesz Product 
We now prove a central estimate of the proof. 
9.1. Lemma. The estimate (7.10) holds. Moreover, we have

(9.2) $$\sup_{V \subset \{1,\dots,q\}} \mathbb{E} \prod_{t \in V} (1 + \widetilde\rho F_t)^2 \lesssim \exp(a' q^{2b}) .$$

Here, $\widetilde\rho$ is as in (7.2), and $a'$ is a fixed constant times $0 < a < 1$, the small constant that enters into the definition of $\widetilde\rho$.

Remark. A conditional expectation argument is essential to this proof. This Lemma is also 
proved in Beck's paper, using a much more involved argument: his more complicated 
Riesz product precludes our simpler line of reasoning. 

Proof. The supremum over V will be an immediate consequence of the proof below, and 
so we don't address it specifically. 

Let us give the initial, essential observation. We expand

$$\mathbb{E} \prod_{t=1}^{q} (1 + \widetilde\rho F_t)^2 = \mathbb{E} \prod_{t=1}^{q} (1 + 2\widetilde\rho F_t + (\widetilde\rho F_t)^2) .$$




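Hold the $x_2$ and $x_3$ coordinates fixed, and let $\mathcal{F}$ be the sigma field generated by $F_1, \dots, F_{q-1}$. We have

$$\mathbb{E}(1 + 2\widetilde\rho F_q + (\widetilde\rho F_q)^2 \mid \mathcal{F}) = 1 + \mathbb{E}((\widetilde\rho F_q)^2 \mid \mathcal{F})$$

(9.3) $$= 1 + a^2 q^{2b-1} + \widetilde\rho^{\,2}\, \Gamma_q ,$$

where $\Gamma_t := \sum_{\vec r \ne \vec s \in A_t,\ r_1 = s_1} f_{\vec r} \cdot f_{\vec s}$.

Then, we see that

$$\mathbb{E} \prod_{t=1}^{q} (1 + 2\widetilde\rho F_t + (\widetilde\rho F_t)^2) = \mathbb{E}\Big\{ \prod_{t=1}^{q-1} (1 + 2\widetilde\rho F_t + (\widetilde\rho F_t)^2) \times \mathbb{E}(1 + 2\widetilde\rho F_q + (\widetilde\rho F_q)^2 \mid \mathcal{F}) \Big\}$$

(9.4) $$\le (1 + a^2 q^{2b-1})\, \mathbb{E} \prod_{t=1}^{q-1} (1 + 2\widetilde\rho F_t + (\widetilde\rho F_t)^2)$$

(9.5) $$+ \Big| \mathbb{E} \prod_{t=1}^{q-1} (1 + 2\widetilde\rho F_t + (\widetilde\rho F_t)^2) \cdot \widetilde\rho^{\,2}\, \Gamma_q \Big| .$$

This is the main observation: one should induct on (9.4), while treating the term in (9.5) as an error, as the 'Beck Gain' estimate (8.4) applies to it.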

Let us set up notation to implement this line of approach. Set

$$N(V; r) := \Big\| \prod_{t=1}^{V} (1 + \widetilde\rho F_t) \Big\|_r , \qquad V = 1, \dots, q .$$

We will use the trivial inequality available from the exponential moments

$$N(V; 4) \le \prod_{t=1}^{V} \|1 + \widetilde\rho F_t\|_{4V} \le (1 + C q^{b-1/2} \sqrt{V})^{V} .$$

This of course is a terrible estimate, but we now use interpolation, noting that

(9.6) $N(V; 2(1 - 1/q)^{-1}) \le N(V; 2)^{1-1/q} \cdot N(V; 4)^{1/q}$.

We see that (9.4), (9.5) and (9.6) give us the inequality

$$N(V+1; 2) \le (1 + a^2 q^{2b-1})^{1/2} N(V; 2) + C \cdot N(V; 2(1 - 1/q)^{-1}) \cdot \|\widetilde\rho^{\,2}\, \Gamma_q\|_q$$

(9.7) $$\le (1 + a^2 q^{2b-1})^{1/2} N(V; 2) + C N(V; 2)^{1-1/q} \cdot N(V; 4)^{1/q}\, \|\widetilde\rho^{\,2}\, \Gamma_q\|_q$$

$$\le (1 + a^2 q^{2b-1})^{1/2} N(V; 2) + C q^{C} n^{-1/2} N(V; 2)^{1-1/q} .$$

In the last line we have used the inequality (8.4).

Of course we only apply this as long as $N(V; 2) \ge 1$. Assuming this is true for all $V \ge 1$, we see that

$$N(q; 2) \lesssim (1 + a^2 q^{2b-1} + C q^{C} n^{-1/2})^{q} \lesssim e^{a' q^{2b}} .$$

Here of course we need $C q^{C} n^{-1/2} \le a q^{2b-1}$, which we certainly have for large $n$.

□ 

10. The Beck Gain 
Let us state the main result of this section. Given $V \subset \{1, \dots, q\}$ let

$$\mathrm{NSD}(V) := \Big\{ \{\vec r_j : j \in V\} \in \prod_{j \in V} A_j \ \Big|\ \text{for each } j \in V, \text{ there is a choice of } j' \in V - \{j\} \text{ and } k = 2, 3 \text{ so that } r_{j,k} = r_{j',k} \Big\} .$$

That is, we take tuples of $r$ vectors, indexed by $V$, requiring that each $\vec r_j$ be in a coincidence. Such sums admit a favorable estimate on their $L^p$ norms.

10.1. Theorem. [The Beck Gain.] There are positive constants $C_0, C_1, C_2, C_3, \kappa$ for which we have the estimate

(10.2) $\rho^{|V|} \|\mathrm{Prod}(\mathrm{NSD}(V))\|_p \le [C_0 |V|^{C_1} p^{C_2} q^{C_3} n^{-\kappa}]^{|V|}$, $V \subset \{1, \dots, q\}$.

Remark. The novelty in this estimate is that we find that (a) the gain can be given in a manner proportional to $|V|$ and (b) the gain also holds in $L^p$ norms. In application, $p \simeq q^{2b} \simeq n^{\epsilon'}$, so the polynomial growth in $p$ and in $q$ is acceptable to us.$^5$

The proof of this Theorem requires a careful analysis of the variety of ways that a product 
can fail to be strongly distinct. That is, we need to understand the variety of ways that 
coincidences can arise, and how coincidences can contribute to a smaller norm. 

Following Beck, we will use the language of Graph Theory to describe these general 
patterns of coincidences, although there is no graph theoretical fact that we need. Rather, 
the use of this language is just a convenient way to do some bookkeeping. 

The class of graphs that we are interested in satisfies particular properties. A graph $G$ is the triple $(V(G), E_2, E_3)$, of the vertex set $V(G) \subset \{1, \dots, q\}$, and edge sets $E_2$ and $E_3$, of color 2 and 3 respectively. Edge sets are subsets of

$$E_j \subset V(G) \times V(G) - \{(k,k) \mid k \in V(G)\} .$$

Edges are symmetric, thus if $(v,v') \in E_j$ then necessarily $(v',v) \in E_j$.

A clique of color $j$ is a maximal subset $Q \subset V(G)$ such that for all $v \ne v' \in Q$ we have $(v,v') \in E_j$. By maximality, we mean that no strictly larger set of vertices $Q' \supsetneq Q$ satisfies this condition.

$^5$ Beck [1] found a gain in $L^2$ norm of order $n^{-1/4}$, for all $V$. Such a small gain of course forces a much shorter Riesz product.

Call a graph $G$ admissible iff

• The edge sets, in both colors, decompose into a union of cliques.

• Any clique $Q_2$ in color 2 and clique $Q_3$ in color 3 can contain at most one common vertex.

• Every vertex is in at least one clique.

A graph $G$ is connected iff for any two vertices in the graph, there is a path that connects them. A path in the graph $G$ is a sequence of vertices $v_1, \dots, v_k$ with an edge of either color spanning adjacent vertices, that is $(v_j, v_{j+1}) \in E_2 \cup E_3$.
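The three admissibility conditions and the notion of connectedness can be checked mechanically. The Python sketch below is one possible encoding, not the authors' formalism: a graph is given by its vertex set together with the list of cliques of each color, and "decompose into a union of cliques" is read here as requiring the cliques of one color to be pairwise disjoint (an assumption of this sketch, motivated by the fact that equality of a coordinate is an equivalence relation). All function names are hypothetical.

```python
def is_admissible(vertices, cliques2, cliques3):
    """Check the three admissibility conditions for cliques given as sets of vertices."""
    vertices = set(vertices)
    # Cliques of one color: at least two vertices, inside V, pairwise disjoint (assumed reading).
    for family in (cliques2, cliques3):
        seen = set()
        for clique in family:
            clique = set(clique)
            if len(clique) < 2 or not clique <= vertices or seen & clique:
                return False
            seen |= clique
    # A color-2 clique and a color-3 clique share at most one vertex.
    if any(len(set(q2) & set(q3)) > 1 for q2 in cliques2 for q3 in cliques3):
        return False
    # Every vertex lies in at least one clique.
    covered = set()
    for clique in list(cliques2) + list(cliques3):
        covered |= set(clique)
    return covered == vertices

def is_connected(vertices, cliques2, cliques3):
    """Check that any two vertices are joined by a path of edges of either color."""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    reached, frontier = {start}, [start]
    while frontier:
        v = frontier.pop()
        for clique in list(cliques2) + list(cliques3):
            if v in clique:
                for w in clique:
                    if w not in reached:
                        reached.add(w)
                        frontier.append(w)
    return reached == vertices

# Three vertices, a color-2 coincidence between 1 and 2, a color-3 coincidence between 2 and 3.
print(is_admissible({1, 2, 3}, [{1, 2}], [{2, 3}]), is_connected({1, 2, 3}, [{1, 2}], [{2, 3}]))
```

A pair of vertices joined in both colors is rejected by the second condition: two vectors of the same parameter $n$ agreeing in the second and third coordinates would coincide.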

Reduction to Admissible Graphs. It is clear that admissible graphs as defined above are naturally associated to sums of products of $r$ functions. Given an admissible graph $G$ on vertices $V$, we set $X(G)$ to be those tuples of $r$ vectors

$$\{\vec r_v : v \in V\} \in \prod_{v \in V} A_v$$

so that if $(v, v')$ is an edge of color $j$ in $G$, then $r_{v,j} = r_{v',j}$.

We will prove the Lemma below in the following two subsections.

10.3. Lemma. For an admissible graph $G$ on vertices $V$ we have the estimate below for positive, finite constants $C_0, C_1, C_2, C_3, \kappa$:

(10.4) $\rho^{|V|} \|\mathrm{Prod}(X(G))\|_p \le [C_0 |V|^{C_1} p^{C_2} q^{C_3} n^{-\kappa}]^{|V|}$, $2 \le p < \infty$.

Let us give the proof of Theorem 10.1 assuming this Lemma. Our tool is the Inclusion- 
Exclusion Principle, but to apply it we need additional concepts. 

Given two admissible graphs $G_1, G_2$ on the same vertex set $V$, let $G_1 \wedge G_2$ be the smallest admissible graph which contains all the edges in $G_1$ and in $G_2$. By smallest, we mean the graph with the fewest number of edges; such a graph may not be defined, in which case we take $G_1 \wedge G_2$ to be undefined. We recursively define $G_1 \wedge \cdots \wedge G_k := (G_1 \wedge \cdots \wedge G_{k-1}) \wedge G_k$. This wedge product is associative.

Let $\mathcal{G}_0$ be the set of admissible graphs on $V$ which are not of the form $G_1 \wedge G_2$ for admissible $G_1 \ne G_2$. These are the 'prime' graphs. (If $V$ is of cardinality 2 or 3, every graph is prime.) Now define $\mathcal{G}_k$ to be those graphs which are equal to a wedge product $G_1 \wedge \cdots \wedge G_k$, with $G_j \in \mathcal{G}_0$, and moreover, $k$ is the smallest integer for which this is true. Clearly, we only need to consider $k \le q$.

Then, by the inclusion-exclusion principle,

(10.5) $$\mathrm{Prod}(\mathrm{NSD}(V)) = \sum_{k=0}^{q} (-1)^{k} \sum_{G \in \mathcal{G}_k} \mathrm{Prod}(X(G)) .$$

The number of admissible graphs on a set of vertices $V$ is at most $2^{|V|} |V|! \le 2^{|V|} |V|^{|V|}$, so that using (10.4) clearly implies Theorem 10.1.

Norm Estimates for Admissible Graphs. We begin this section with a further reduction to connected admissible graphs. Let us write $G \in BG(C_0, C_1, C_2, C_3, \kappa)$ if the estimate (10.4) holds. ('BG' for 'Beck Gain.') We need to see that all admissible graphs are in $BG(C_0, C_1, C_2, C_3, \kappa)$ for non-negative, finite choices of the relevant constants.

10.6. Lemma. Let $C_0, C_1, C_2, C_3, \kappa$ be non-negative constants. Suppose that $G$ is an admissible graph, and that it can be written as a union of subgraphs $G_1, \dots, G_k$ on disjoint vertex sets, where all $G_j \in BG(C_0, C_1, C_2, C_3, \kappa)$. Then,

$$G \in BG(C_0, C_1, C_2, C_2 + C_3, \kappa) .$$

With this Lemma, we will identify a small class of graphs for which we can verify the property (10.4) directly and then appeal to this Lemma to deduce Theorem 10.1. Accordingly we modify our notation. If $\mathcal{G}$ is a class of graphs, we write $\mathcal{G} \subset BG(\kappa)$ if there are constants $C_0, C_1, C_2, C_3$ such that $\mathcal{G} \subset BG(C_0, C_1, C_2, C_3, \kappa)$.

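Proof. Write $V_j$ for the vertex set of $G_j$. We then have by Proposition 10.7

$$\mathrm{Prod}(X(G)) = \prod_{j=1}^{k} \mathrm{Prod}(X(G_j)) .$$

Using Hölder's inequality, we can estimate

$$\rho^{|V|} \|\mathrm{Prod}(X(G))\|_p \le \prod_{j=1}^{k} \rho^{|V_j|} \|\mathrm{Prod}(X(G_j))\|_{kp}$$

$$\le \prod_{j=1}^{k} [C_0 |V_j|^{C_1} (kp)^{C_2} q^{C_3} n^{-\kappa}]^{|V_j|}$$

$$\le [C_0 |V|^{C_1} p^{C_2} q^{C_2 + C_3} n^{-\kappa}]^{|V|} .$$

Here, we use the fact that since the graphs are non-empty, we necessarily have $k \le q$.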

□ 

10.7. Proposition. Let $G_1, \dots, G_p$ be admissible graphs on pairwise disjoint vertex sets $V_1, \dots, V_p$. Extend these graphs in the natural way to a graph $G$ on the vertex set $V = \bigcup_t V_t$. Then, we have

$$\mathrm{Prod}(X(G)) = \prod_{t=1}^{p} \mathrm{Prod}(X(G_t)) .$$

Connected Graphs Have the Beck Gain. We single out for special consideration the connected admissible graphs $G$. Let $\mathcal{G}_{\mathrm{connected}}$ be the collection of all admissible connected graphs on $V \subset \{1, \dots, q\}$.

10.8. Lemma. We have $\mathcal{G}_{\mathrm{connected}} \subset BG(A)$.

We will have to pay special attention to the case of 2 and 3 vertices. It is important to observe that the first coordinates are necessarily distinct, and have the partial order inherited from the vertex set $V$. Namely the vertex set $V \subset \{1, \dots, q\}$, and $V$ inherits the order from the integers. By the construction of our Riesz product, the first coordinates inherit this same order.

General Remarks on Littlewood-Paley Inequality. These remarks are essential to our analysis of this lemma, and the Theorem we are proving. The vertex set $V$ is a subset of $\{1, \dots, q\}$ and it inherits an order from that set. Moreover, the tuples of $r$ vectors do as well. Namely, writing

$$V = \{v_1 < \cdots < v_\ell\},$$

for $\{\vec r_1, \dots, \vec r_\ell\} \in X(G)$, we have, by construction, $r_{1,1} < \cdots < r_{\ell,1}$. This holds since $r_{m,1} \in I_{v_m}$, where $I_m$ is the increasing sequence of intervals of length equal to $n/q$ that partition $\{1, \dots, n\}$.

There is a natural way to apply the Littlewood-Paley inequalities. For integer $b_\ell \in I_{v_\ell}$, let $X(G; b_\ell)$ be the tuples of $r$ vectors $\{\vec r_1, \dots, \vec r_\ell\}$ such that $r_{\ell,1} = b_\ell$. We have

(10.9) $$\|\mathrm{Prod}(X(G))\|_p \lesssim \sqrt{p}\, \Big\| \Big[ \sum_{b_\ell \in I_{v_\ell}} |\mathrm{Prod}(X(G; b_\ell))|^2 \Big]^{1/2} \Big\|_p .$$

It is tempting to continue this procedure, by applying the Littlewood-Paley inequality again to the vertex $v_{\ell-1}$. Yet, and this is an important point, due to the nature of $r$ functions, this option is blocked to us. The vertex $v_\ell$ is in at least one clique $Q$ of, say, color 2. We could choose a value $c_Q$ for that clique, thereby specifying all coordinates of the vector $\vec r_\ell$. Set $X(G; b_\ell; c_Q)$ to be the tuples of $r$ vectors $\{\vec r_1, \dots, \vec r_{\ell-1}\}$ such that

$$\{\vec r_1, \dots, \vec r_{\ell-1}, (b_\ell, c_Q, n - b_\ell - c_Q)\} \in X(G; b_\ell) .$$

Here, $X(G; b_\ell; c_Q)$ consists of tuples of length $\ell - 1$, since the vector $\vec r_\ell$ is completely specified. Thus, we see that

(10.10) $$\|\mathrm{Prod}(X(G))\|_p \lesssim \sqrt{p} \cdot n \cdot \sup_{c_Q} \Big\| \Big[ \sum_{b_\ell} \mathrm{Prod}(X(G; b_\ell; c_Q))^2 \Big]^{1/2} \Big\|_p .$$

At this point, the (Hilbert space) Littlewood-Paley inequalities will again apply.

We will refer to the notation above. Keep in mind that $b$ is for the coordinates specified by a Littlewood-Paley inequality; $c$ is for the coordinates in a coincidence to which we apply the triangle inequality. We shall return to these themes momentarily.

Proof of Lemma 10.8. We begin the proof with a discussion of the case of two and three vertices, which will not be susceptible to the general methods related to the Littlewood-Paley inequality outlined above.

The Case of Two Vertices. Notice that if $G$ consists of only two vertices, the relevant estimate is (8.4). Namely, we have

$$\|\mathrm{Prod}(X(G))\|_p \le C p^{3/2} n^{3/2} .$$

Equivalently, $G \in BG(C_0, 3/4, 0, 1/4)$.

The Case of Three Vertices. The case of $G \in \mathcal{G}_{\mathrm{connected}}$ having three vertices depends critically on the same phenomena behind the Beck Gain for graphs on two vertices. We will deduce this case as a corollary to the case of two vertices.

There are three distinct sub-cases. The most delicate of the cases is as follows. The graph is depicted as

(10.11) $$\begin{array}{ccc} v_1 & v_2 & v_3 \\ \square & \square & \square \\ \bullet = & \bullet & \\ & \bullet & = \bullet \end{array}$$

where $v_1 < v_2 < v_3$. (The case of $v_2 < v_1 < v_3$ is entirely the same, and we don't discuss it directly.)

By our general remarks on the Littlewood-Paley inequality, this inequality applies in the first coordinate, to the vertex $v_3$. Using the notation in (10.9), we have

$$\|\mathrm{Prod}(X(G))\|_p \lesssim \sqrt{p}\, \Big\| \Big[ \sum_{b_3 \in I_{v_3}} |\mathrm{Prod}(X(G; b_3))|^2 \Big]^{1/2} \Big\|_p .$$

The vectors at $v_2$ and $v_3$ have a coincidence in the third coordinate. Therefore, we specify the value of the coincidence to be $c_3$ and estimate

(10.12) $$\|\mathrm{Prod}(X(G))\|_p \lesssim \sqrt{p} \cdot n \cdot \sup_{c_3} \Big\| \Big[ \sum_{b_3} \mathrm{Prod}(X(G; b_3; c_3))^2 \Big]^{1/2} \Big\|_p .$$

Recall that $X(G; b_3; c_3)$ consists only of pairs of vectors. This graph can be depicted as

$$\begin{array}{cc} v_1 & v_2 \\ \square & \square \\ \bullet = & \bullet \end{array}$$

But this is the case considered in (8.18). From that inequality, we see that we have the estimate

$$\|\mathrm{Prod}(X(G; b_3; c_3))\|_p \lesssim p\, n^{5/4} .$$

Therefore,

$$\Big\| \Big[ \sum_{b_3} \mathrm{Prod}(X(G; b_3; c_3))^2 \Big]^{1/2} \Big\|_p \le \sqrt{n}\, \sup_{b_3} \|\mathrm{Prod}(X(G; b_3; c_3))\|_p \lesssim p\, n^{7/4} .$$

Here we have crudely estimated the sum over $b_3$ in (10.12). Combining the last estimate with (10.12), we see that

(10.13) $\|\mathrm{Prod}(X(G))\|_p \lesssim p^{3/2} n^{11/4}$.

Recall that the point of comparison is to $\rho^{-3} = n^3 q^{-3/2}$, and the estimate above is smaller by $n^{-1/4}$. Thus the class of graphs given by (10.11) is contained in $BG(1/12)$.
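The bookkeeping behind these membership statements is a normalization per vertex: since $\rho^{-3} = n^3 q^{-3/2}$, each factor of $\rho$ contributes $n^{-1}$, so a bound $\|\mathrm{Prod}(X(G))\|_p \lesssim n^{e}$ on a graph with $|V|$ vertices corresponds to $\kappa = (|V| - e)/|V|$. A small Python sketch of this arithmetic, using exact fractions; the function name is hypothetical, and the powers of $p$ and $q$ are deliberately ignored.

```python
from fractions import Fraction

def beck_gain_exponent(num_vertices, n_power):
    """Per-vertex gain kappa when the norm bound carries n**n_power on num_vertices vertices.

    Only the power of n is tracked; powers of p and q are absorbed in the other constants.
    """
    return (Fraction(num_vertices) - Fraction(n_power)) / num_vertices

# Two vertices with n^{3/2}, and three vertices with n^{11/4}.
print(beck_gain_exponent(2, Fraction(3, 2)), beck_gain_exponent(3, Fraction(11, 4)))
```

For two vertices this returns $1/4$, matching the last entry of $BG(C_0, 3/4, 0, 1/4)$, and for the three-vertex graph (10.11) it returns $1/12$.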

The other case is when the graph can be depicted by

$$\begin{array}{ccc} v_1 & v_3 & v_2 \\ \square & \square & \square \\ \bullet = & \bullet & \\ & \bullet & = \bullet \end{array}$$

where $v_3$, the maximal index, is in both cliques. This case is much easier, as one application of the Littlewood-Paley inequality, and the triangle inequality, will determine the value of both cliques. It is very easy to see that this class of graphs is in $BG(1/6)$, and the details are omitted. The third case is even easier: it involves the graphs which have a clique of size three in one of the coordinates. Hence the discussion of graphs on three vertices is complete.

A General Estimate. We now present a general recursive estimate for the $L^p$ norm of $\mathrm{Prod}(X(G))$, assuming that $G$ is a connected graph on at least four vertices. Write $V$ as

$$V = \{v_1 < \cdots < v_\ell\} .$$

The estimate is obtained recursively. Along the way we will construct two disjoint subsets $V_{3/2}, V_{1/2} \subset V$. $V_{3/2}$ will be the vertices to which we apply both the Littlewood-Paley and triangle inequalities, thus these vertices contribute $n^{3/2} q^{-1/2}$ to our estimate. $V_{1/2}$ will be the vertices to which we apply only the Littlewood-Paley inequality, thus these vertices contribute $(n/q)^{1/2}$ to our estimate. Those vertices not in $V_{3/2} \cup V_{1/2}$ will be those which are determined by earlier steps in the procedure. They contribute nothing to our estimate. In estimating an $L^p$ norm, the power of $p$ is one-half of the number of applications of the Littlewood-Paley inequality, namely $\tfrac12 \sharp(V_{3/2} \cup V_{1/2})$.

The purpose of these considerations is to prove the estimate

(10.14) $\|\mathrm{Prod}(X(G))\|_p \le (C \sqrt{p})^{|V_{3/2}| + |V_{1/2}|} (n/q)^{(|V_{3/2}| + |V_{1/2}|)/2}\, n^{|V_{3/2}|}$.

Initialize

$$V_{3/2} \leftarrow \emptyset , \qquad V_{1/2} \leftarrow \emptyset , \qquad \mathcal{Q}_{\mathrm{fixed}} \leftarrow \emptyset .$$

The last collection consists of those cliques which are specified by earlier stages of the argument. The estimate maintained through the recursion is

(10.15) $$\|\mathrm{Prod}(X(G))\|_p \le (C\sqrt{p})^{|V_{3/2}| + |V_{1/2}|}\, n^{|V_{3/2}|} \sup_{c} \Big\| \Big[ \sum_{b \in \{1,\dots,n\}^{V_{3/2} \cup V_{1/2}}} |\mathrm{Prod}(X(G; b; c))|^2 \Big]^{1/2} \Big\|_p .$$

Base Case of the Recursion. We update $V_{3/2} \leftarrow \{v_\ell\}$, since it is the maximal element. We update $\mathcal{Q}_{\mathrm{fixed}}$ to those cliques which contain $v_\ell$. Then (10.15) is a consequence of (10.10).

Recursive Case. At this point, we have the datum $V_{3/2}$, $V_{1/2}$, and $\mathcal{Q}_{\mathrm{fixed}}$. We also have datum $b \in \{1, \dots, n\}^{V_{3/2} \cup V_{1/2}}$, and $c \in \{1, \dots, n\}^{V_{3/2}}$. Notice that this datum can completely specify some $r$ vectors associated to vertices not in $V_{3/2} \cup V_{1/2}$: think of a vertex that is in two cliques in $\mathcal{Q}_{\mathrm{fixed}}$.

The recursion stops if every vertex $v_k$ is determined by this datum. Otherwise, let $k$ be the largest integer such that $\vec r_{v_k}$ is not determined by this datum. If no clique in $\mathcal{Q}_{\mathrm{fixed}}$ contains $v_k$, update

$$V_{3/2} \leftarrow V_{3/2} \cup \{v_k\} ,$$

and update $\mathcal{Q}_{\mathrm{fixed}}$ to include those cliques which contain $v_k$. By application of the Littlewood-Paley inequality and the triangle inequality, the estimate (10.15) continues to hold for these updated values.

If some clique in $\mathcal{Q}_{\mathrm{fixed}}$ contains $v_k$, then there can be exactly one clique $Q_{v_k}$ which does, for otherwise $\vec r_{v_k}$ would have been completely specified by these two cliques. Update

$$V_{1/2} \leftarrow V_{1/2} \cup \{v_k\} ,$$

and update $\mathcal{Q}_{\mathrm{fixed}}$ to include all cliques which contain $v_k$. By application of the Littlewood-Paley inequality, the estimate (10.15) continues to hold for these updated values.

Once the recursion stops the inequality (10.15) holds. But note that we necessarily have

$$\mathrm{Prod}(X(G; b; c))^2 = 1 ,$$

as all $r$ vectors are completely determined by $b$ and $c$. Therefore, we have proven (10.14).

The Conclusion of the Proof. Since $V_{3/2}$ and $V_{1/2}$ are disjoint subsets of $V$, we have proven the inequality

(10.16) $$\rho^{|V|} \|\mathrm{Prod}(X(G))\|_p \le (C \sqrt{pq})^{|V|}\, n^{\frac32 |V_{3/2}| + \frac12 |V_{1/2}| - |V|} .$$

And the remaining analysis concerns the exponent on $n$ above, namely we should see that

(10.17) $$|V|^{-1} \big( \tfrac32 |V_{3/2}| + \tfrac12 |V_{1/2}| - |V| \big) \le -\tfrac{1}{10}$$

for all connected graphs $G$ on at least four vertices, which is a gain with the fixed positive choice $\kappa = \tfrac{1}{10}$. We would conclude that this collection of graphs is in $BG(1/10)$.

In order to make the left hand side of (10.17) as large as possible, we should maximize $V_{3/2}$. To continue, we note another formula. Let $E(G)$ be the total number of edges in the graph $G$, and let $E(v)$ be the number of edges in $G$ with one endpoint of the edge being $v$.

For $v \in V_{3/2} \cup V_{1/2}$, let $F(v)$ be the number of edges which are specified upon the selection of that vertex in our recursive procedure. It is clear that we have $E(v) = F(v)$ if $v \in V_{3/2}$. But also,

$$\sum_{v \in V_{3/2} \cup V_{1/2}} F(v) = E(G) .$$

It follows that to maximize the cardinality of $V_{3/2}$, those vertices must be in small cliques. There are two different classes of graphs which are extremal with respect to these criteria.

The first extremal class consists of graphs $G$ with all cliques being of size 2, and the number of cliques is $|V| - 1$. For such graphs, $|V_{3/2}| \le \lceil \tfrac12 |V| \rceil$, and if the value is maximal then $|V_{1/2}|$ is $0$ if $|V|$ is odd, and $1$ if $|V|$ is even. It is straightforward to see that the maximum of (10.17) occurs at $|V| = 5$, and is $-\tfrac{1}{10}$. Here, it is vital that we have already discussed the case of two and three vertices!

The second class are graphs on an even number of vertices, with half the vertices in a clique $Q$, and each vertex $v \in Q$ is in one clique of size 2. One can depict such a graph on six vertices as

$$\begin{array}{cccccc} v_1 & v_2 & v_3 & v_4 & v_5 & v_6 \\ * = & * = & * & & & \\ a & b & c & a & b & c \end{array}$$

The vertices are written in increasing order: $v_1 < v_2 < v_3 < v_4 < v_5 < v_6$. Note that $v_1, v_2, v_3$ form a single clique of color 2. There are three additional cliques of size 2, all of color 3. They are $\{v_j, v_{j+3}\}$ for $j = 1, 2, 3$. For such a graph, it is clear that $|V_{3/2}| = \tfrac12 |V|$, and $|V_{1/2}| = 1$.$^6$ The term (10.17) behaves exactly like the first class of extremal graphs on an even number of vertices. Our proof is complete.

□
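The recursive construction of $V_{3/2}$ and $V_{1/2}$ used in this proof can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: a graph is encoded simply as a list of cliques (sets of integer vertex labels, with labels ordered as the vertices are), each vertex is assumed to lie in at most one clique of each color, and the function names are hypothetical. Vertices are visited in decreasing order; a vertex lying in two already fixed cliques is treated as determined, a vertex in no fixed clique goes to $V_{3/2}$, and a vertex in exactly one fixed clique goes to $V_{1/2}$.

```python
from fractions import Fraction

def classify_vertices(vertices, cliques):
    """Return (V_32, V_12) as lists, following the recursion from the largest vertex down."""
    fixed = set()          # indices of cliques whose common value is already specified
    v32, v12 = [], []
    for v in sorted(vertices, reverse=True):
        mine = {i for i, clique in enumerate(cliques) if v in clique}
        already_fixed = mine & fixed
        if len(already_fixed) >= 2:
            continue       # the vector at v is determined by two fixed cliques
        if already_fixed:
            v12.append(v)  # Littlewood-Paley inequality only
        else:
            v32.append(v)  # Littlewood-Paley and triangle inequalities
        fixed |= mine      # every clique containing v is now specified
    return v32, v12

def normalized_exponent(vertices, cliques):
    """Left-hand side of (10.17): (3/2 |V_32| + 1/2 |V_12| - |V|) / |V|."""
    v32, v12 = classify_vertices(vertices, cliques)
    size = len(set(vertices))
    return (Fraction(3, 2) * len(v32) + Fraction(1, 2) * len(v12) - size) / size

# The six-vertex graph of the second extremal class.
six = [{1, 2, 3}, {1, 4}, {2, 5}, {3, 6}]
print(classify_vertices(range(1, 7), six), normalized_exponent(range(1, 7), six))
```

On the six-vertex graph the sketch gives $|V_{3/2}| = 3$ and $|V_{1/2}| = 1$, in agreement with the count above. On a five-vertex graph with four cliques of size 2 arranged as the path $5, 1, 4, 2, 3$ (an example chosen for this sketch), it gives $|V_{3/2}| = 3$, $|V_{1/2}| = 0$ and the value $-\tfrac{1}{10}$.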